Melting of genomic DNA: predictive modeling by nonlinear lattice dynamics 
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The melting behavior of long, heterogeneous DNA chains is examined within the framework of 
the nonlinear lattice dynamics based Peyrard-Bishop-Dauxois (PBD) model. Data for the pBR322 
plasmid and the complete T7 phage have been used to obtain model fits and determine parameter 
dependence on salt content. Melting curves predicted for the complete fd phage and the Yl and 
Y2 fragments of the (^X174 phage without any adjustable parameters are in good agreement with 
experiment. The calculated probabilities for single base-pair opening are consistent with values 
obtained from imino proton exchange experiments. 

PACS numbers: 87.14.ek. 87.15. A-, 87.15. Zg 



I. INTRODUCTION 

The phenomenon of thermal DNA denaturation can 
be interpreted in terms of mesoscopic, one-dimensional, 
statistical or nonlinear lattice-dynamical models. Mod- 
els of the first type, based on the equilibrium statisti- 
cal mechanics of the helix-coil transition, proposed and 
solved long ago by Poland and Scheraga (PS) ^j] have 
been extensively studied, refined and developed by sub- 
sequent research; this substantial effort, incorporating a 
large body of enthalpic data, has led to algorithms for the 
detailed prediction of melting profiles with a rich internal 
structure [3, 3|; ^-^ s- result, the microscopic parametriza- 
tion of this family of models is well advanced [4] . Models 
of the second type, first proposed by Peyrard, Bishop and 
Dauxois (PBD) [^-Q were largely motivated by the study 
of dynamical properties, in particular those associated 
with the local breathing of DNA which characterizes the 
initiation of the transcription process; in the limiting case 
of infinite, homogeneous chains they describe DNA de- 
naturation as an exact, one-dimensional thermodynamic 
phase transition Q, whose effective (observable) order is 
controlled by the strength of the nonlinear base-stacking 
interaction [9, [l3] and can be followed in detail by finite- 
size scaling analysis |11| . Curiously, although this type 
of model is uniquely suited [§, [l^l to describe the sharp 
multistep melting process taking place at long sequences, 
where the effects of cooperativity are most likely to be 
correctly modeled with a small set of coupling param- 
eters, comparisons of PBD-based DNA melting curves 
with experimental data have been confined to relatively 
short chains where multiphase melting is either absent 
[lj,[lj] or marginal |15l - [l7| . As a result, one of the major 
potential sources of experimental validation of the PBD 
model remains unexplored. 

In this paper I will present an analysis of melting pro- 
files based on the standard version of the PBD model 
[6| for a number of specific sequences with many thou- 
sands of base pairs. I will use the parameters obtained 
by fitting two sets of published experimental data cor- 
responding to different ionic contents, the pBR322 plas- 



mid and the T7 phage melting profiles, to estimate the 
parameters' variation with ionic content. This procedure 
provides model parameters for any ionic content in the 
usual experimental range and thus paves the way for pre- 
dicting the melting profile of any long DNA sequence. I 
present predicted melting profiles with no adjustable pa- 
rameters and compare them with experimental data in 
three cases: the complete fd-phage, and the Yl and Y2 
fragments of the 0X174 phage. The computed profiles, 
although not perfect, reflect accurately many of the ac- 
tual complexities of sequence-dependent melting. Fur- 
thermore, the set of parameters obtained by fitting the 
melting profiles can be used to predict the probability of 
single base-pair opening at ambient temperatures. The 
calculated values, in the order of 1 ppm, are in line with 
estimates obtained from imino proton exchange measure- 
ments [18]. In summary, the PBD model seems to provide 
an excellent mesoscopic, dynamically motivated frame- 
work for the quantitative description of genomic DNA 
melting; this could be of particular use in further work 
where the model enjoys a comparative advantage, e.g. in 
extending the study of local DNA fiexibility [l^ to longer 
sequences. 



II. MODEL NOTATION AND NUMERICAL 
PROCEDURE 

The configurational energy of a chain of N base pairs 
is 

JV-l N 

Hp=Y,W{y„y,+^)+Y,V,{y,) (1) 

where the transverse coordinate j/j represents the sepa- 
ration of the two bases at the jth site, the anharmonic 
elastic term 
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models the nonlinear base-stacking interaction, and the 
on-site Morse potential Vj{yj) = Dj{l — e""^*)^ de- 



scribes the combined effects of hydrogen-bonding, stack- 
ing and solvent acting on the jth base pair. I will use 
the same parameter values k = 0.00045 eV/A^ for the 
linear part of the base stacking interaction, b — 0.2^1^^ 
for the inverse range of the nonlinear base stacking, and 
aAT = 4:.2A^^, ace = 6.9A^^ for the inverse ranges 
of the Morse potentials for all sequences discussed in this 
paper; p, the relative strength of the nonlinear base stack- 
ing will be set equal to 50, which is roughly comparable 
to the ratio of accepted values for the bending stiffness 
of double-stranded and single-stranded DNA[20]. This 
leaves only the depths of the Morse potential Dat and 
Doc to be determined from the data (cf. below). Fur- 
thermore, I will assume that the chain is subject to free- 
end boundary conditions. 

Key thermodynamic quantities of interest are (i) the 
partition function 
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and (ii) the probability that the rth base pair is bound. 
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where ks is the Boltzmann constant, T the temperature 
and j/c is an appropriately chosen displacement which 
distinguishes the open {y > yc) from the closed (j/ < yc) 
state of a base pair. Unless otherwise stated, the choice 
will be yc = 2A. 

As has been extensively discussed in the literature (e.g. 
[13 |) the partition function ([3]) diverges because of the flat 
top of the Morse potential; however, this divergence - ul- 
timately responsible for the occurrence of an exact phase 
transition in the thermodynamic limit - is restricted to 
the disordered phase 21]; as a result, Q is always well 
defined - and can be shown to be numerically stable, i.e. 
independent of any upper cutoff of the integrations - for 
sufficiently long chains [22|. 

Numerical work was based on expanding [12, [2J] each 
factor in the partition function in terms of the complete 
set of eigenstates of the integral equation 
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with K{y,y') = e-[^(2^'2^')+^'*^(2')/2+^'4^(!^')/2l/'=s'^ serv- 
ing as a "reference" kernel. This transforms the multi- 
dimensional integrals in Eqs. (jSj and (J4]) into matrix 
products in the reduced space of eigenstates of (jS]) - typ- 
ically of dimension 30-100 if one keeps only eigenstates 
with A^/Ao > 10^^; the resulting matrix products can 
be calculated quite rapidly if one stores the intermediate 
results. Numerical computations have been performed 
by substituting the lower and upper limits of integration 
by —1.5 A (owing to the rapid vanishing of the kernel 



which originates in the repulsive core of the Morse po- 
tential) and 300A, respectively; extending the region of 
integration in either direction does not produce any de- 
tectable change in the resulting melting profiles. Eq. [5] 
has been discretized using Gauss-Legendre quadratures 
with a mesh of 1200 points. Further numerical details 
will be reported elsewhere. 



III. RESULTS 



by 



A. Melting profiles 



The fraction of unbound ("melted") base pairs is given 



= 1- 
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-Y 



(6) 



Fig. [T] shows calculated and experimental differential 
melting curves dO/dT vs. T for (i) the plasmid pBR322 
(4361 bps, GC content 53.8 %, Na"*" ion concentration 
Co = 0.075M, DV = 0.1255 eV, Dg.^. = 0.1655 eV), 
and (ii) the the T7 phage (39937 bps, 48.4% GC content, 
Na+ ion concentration c = 0.0195M, Dat = 0.1205 eV, 
Dgc — 0.1619 eV). The calculated melting curves, al- 
though not in perfect agreement with the experimental 
ones, reflect to a considerable extent the latter's com- 
plexities. Note in particular the multipeak structure on 
the left panel, and the correct overall shape of the melt- 
ing curve on the right panel. The superimposed melting 
map shown in the left panel (melting temperatures, in 
steps of 0.5 K, uniquely defined for each base pair via 
Pi\Tm{'iy\ = 1/2) exhibits the distinct vertical regions 
known from statistical (PS-type) theories of DNA de- 
naturation to coincide with peaks of the melting profile 
and to characterize cooperative melting of large domains, 
extending over hundreds of base pairs. 

A careful analysis of the critical behavior of pure AT 
and GC sequences reveals essentially linear dependence 
of the melting temperature on D in the range of interest. 
Since Tm is known [23| to depend logarithmically on c, it 
is natural to assume that 
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where a — AT, GC, and use the two sets of Morse depth 
parametes to extract estimates of the proportionality 
constants, k^x = 0.00855 eV and kqc = 0.00615 eV, 
respectively [25] . This allows the calculation of Dat and 
Dgc at any salt concentration, and therefore, in princi- 
ple, the prediction of melting behavior for any long DNA 
sequence. 

I have tested the validity of the above procedure in 
three cases. Fig. [2] compares predicted to experimen- 
tal melting profiles for the Yl and Y2 fragments of 
the (pXlTA phage. The Na+ ion concentration is c = 
0.195M and the corresponding Morse depths, based on 
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FIG. 1. (color online) Left panel: left y-axis, melting curve 
for the plasmid pBR322 (4361 bps), solid line PBD model 
calculation, dashed red line experimental data redrawn from 
[^; superimposed (dotted line) is the melting map, to be read 
as melting temperature vs. site (right vertical axis); right 
panel: computed melting profile (solid line, left y-axis) for 
the T7 phage (39937 bps); experimental data redrawn from 
[291 1 (dashed red line); also shown (dotted curve, almost in- 
distinguishable from the melting curve, right y-axis) is the 
temperature derivative of the entropy per site. 



dZl), Dat = 0.12905 and Dgc = 0.16805 respectively. 
Fig. [3] compares predicted to experimental melting pro- 
files for the fd-phage (6408 bps, 40.9% GC content). The 
Na+ ion concentration was [26j] c = 0.0195M, therefore 
the Morse depths used are the same those of the T7-phage 
(cf. above). In all three cases, PBD-based calculations 
with no adjustable parameters successfully predict melt- 
ing temperatures and profiles in considerable detail. 
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FIG. 2. (color online) left panel: Melting curves for Yl frag- 
ment (2746 bps) of the ^X174 phage, solid line PBD model 
prediction, dashed line experimental curve redrawn from ^]; 
right panel: same for Y2 fragment (1695 bps), experimen- 
tal data redrawn from [2| based on two different experiments 
(dashed line [H], dotted line [13). 



B. Longer sequences 

The numerical method used here can in principle deal 
with much longer sequences. The right panel of Fig. 
[3| shows the theoretical melting profile obtained for the 
full genome sequence of the bacterial endosymbiont Car- 
sonella ruddii (159662 bps, GC content 16.6%, shortest 
genome of any characterized bacteria jSOj) using - some- 
what arbitrarily - the same parameters as in the Yl and 
Y2 fragments (cf. above). The transition has a temper- 
ature width of almost 20 K; this is unusually broad for 
such a large system and originates in large scale fluctua- 
tions of locally averaged GC-content. 



C. Melting Entropy 

The entropic signature of the melting transition can 
be followed in some detail by looking at the temper- 
ature derivative ds/dT of the entropy per site s = 
[kB/N]d{T\nZN)/dT. The right panel of Fig. [U shows 
that the proportionality relationship 



ds , de 
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dT dT 
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holds to a great degree of accuracy (the two curves be- 
ing practically coincident), with As being defined as 
the entropy of melting. Within the PBD model frame- 
work, calorimetry and base-pair sensitive methods (e.g. 
UV absorption) deliver essentially identical profile in- 
formation. Numerical integration gives values As — 
4.46, 4.35, 4.41fcB, for the pBR322 plasmid, the fd and 
T7 phages respectively, and 4.40 ks for the Yl and 
Y2 fragments; although considerably lower than typi- 
cal experimental values 3] of 12fcB (corresponding to 24 
cal/mol(bp)/K), these values represent a significant im- 
provement over previous estimates, around Ifcs, obtained 
[lOj using oligomer-based [l^ parameters. 



D. Opening of individual base pairs 

Base pair opening has been carefully investigated [13] 
by making use of the imino proton exchange technique. 
The experiments suggest (i) a closed base pair lifetime in 
the order of a few milliseconds and (ii) equilibrium con- 
stants between open and closed states corresponding to a 
fraction of open base pairs of the order of 1 ppm (at room 
temperature). Since it is not immediately obvious what 
values oiy > He correspond to the open state in the imino 
proton experiment, I have plotted in Fig. [Jjthe melting 
fraction as a function of temperature for a range of p val- 
ues and two choices of critical amplitude ijc, equal to 2 A 
and 5^, respectively. The results indicate that the choice 
p = 50, He = ^A is consistent with the conclusions of the 
imino proton exchange experiments. It should of course 
be noted that the exact choice of j/c does not affect the 





FIG. 4. (color online) Fraction of unbound base pairs as 
a function of temperatures calculated for the T7-pliage for 
p = 2, 5, 10, 20, 50 and yc = 2A (dotted lines), j/c = 5 A (solid 
lines). Only the lowest curve (p — 50, i/c = 5^1) leads to results 
consistent with estimates 6 ~ 10^'' — 10~® obtained from 
imino proton exchange measurements at room temperature 
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FIG. 3. (color online) Left panel: Melting curve for the 
fd-phage (6408 bps), PBD model prediction (solid line) vs ex- 
periment (dashed line, redrawn from 26]). Right panel: The- 
oretical (PBD) melting profile of Carsoiiella Ruddii (159662 
bps). 



IV. CONCLUDING REMARKS 



overall shape or position of the melting profile (except for 
a sharpening of the edges). Nonetheless, it is reassuring 
to confirm that a reasonable choice of yc, when applied 
to the PDB model vi^ith the above (differential melting 
curve - based) paramctrization leads to room tempera- 
ture open base-pair populations which are in agreement 
with an entirely different class of experiments. 
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In summary, the nonlinear lattice dynamics (PBD) ap- 
proach to the statistical physics of DNA denaturation, in 
addition to elucidating the more abstract, qualitative as- 
pects of the phase transition, has been shown to provide a 
powerful tool for the detailed description of quantitative 
features of melting in long genomic sequences. 
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with Michel Peyrard. 
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